
Effects of different missing data imputation techniques on the performance of undiagnosed diabetes risk prediction models in a mixed-ancestry population of South Africa


Abstract

BACKGROUND: Imputation techniques used to handle missing data are based on the principle of replacement. It is widely advocated that multiple imputation is superior to other imputation methods; however, studies have suggested that simple methods for filling in missing data can be just as accurate as complex methods. The objective of this study was to implement a number of simple and more complex imputation methods, and to assess the effect of these techniques on the performance of undiagnosed diabetes risk prediction models during external validation.

METHODS: Data from the Cape Town Bellville-South cohort served as the basis for this study. Imputation methods and models were identified via recent systematic reviews. Models' discrimination was assessed and compared using the C-statistic and non-parametric methods, before and after recalibration through simple intercept adjustment.

RESULTS: The study sample consisted of 1256 individuals, of whom 173 were excluded due to previously diagnosed diabetes. Of the final 1083 individuals, 329 (30.4%) had missing data. Family history had the highest proportion of missing data (25%). Imputation of the outcome, undiagnosed diabetes, was highest with stochastic regression imputation (163 individuals). Overall, deletion resulted in the lowest model performance, while simple imputation yielded the highest C-statistic for the Cambridge Diabetes Risk model, Kuwaiti Risk model, Omani Diabetes Risk model and Rotterdam Predictive model. Multiple imputation yielded the highest C-statistic only for the Rotterdam Predictive model, and this was matched by simpler imputation methods.

CONCLUSIONS: Deletion was confirmed as a poor technique for handling missing data. However, despite the emphasized disadvantages of simpler imputation methods, this study showed that implementing these methods results in similar predictive utility for undiagnosed diabetes when compared to multiple imputation.
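The quantities named in the abstract (simple imputation, the C-statistic, and recalibration through intercept adjustment) can be illustrated with a minimal sketch. This is not the authors' code or data: the toy `age` and `y` values, the coefficients `-6.0` and `0.08`, and all function names are assumptions made up for illustration, and mean imputation stands in for "simple imputation" generally.

```python
import math

def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def c_statistic(y_true, scores):
    """Share of (case, non-case) pairs in which the case has the higher score.
    Tied pairs count as one half."""
    cases = [s for s, y in zip(scores, y_true) if y == 1]
    controls = [s for s, y in zip(scores, y_true) if y == 0]
    concordant = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def intercept_shift(linear_predictors, y_true, lo=-20.0, hi=20.0, tol=1e-10):
    """Find by bisection the constant that, added to every linear predictor,
    makes the mean predicted risk equal the observed prevalence."""
    target = sum(y_true) / len(y_true)

    def gap(delta):
        risks = [sigmoid(lp + delta) for lp in linear_predictors]
        return sum(risks) / len(risks) - target

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Toy data (hypothetical): one predictor with two missing values.
age = [45, 60, None, 52, 70, 38, None, 66]
y = [0, 1, 0, 0, 1, 0, 1, 1]

age_imputed = mean_impute(age)

# Hypothetical published model: logit(risk) = -6.0 + 0.08 * age.
lp = [-6.0 + 0.08 * a for a in age_imputed]

delta = intercept_shift(lp, y)
recalibrated_lp = [v + delta for v in lp]
recalibrated_risk = [sigmoid(v) for v in recalibrated_lp]

print("C-statistic before:", c_statistic(y, lp))
print("C-statistic after: ", c_statistic(y, recalibrated_lp))
print("Mean predicted risk after adjustment:", sum(recalibrated_risk) / len(recalibrated_risk))
```

Adding a constant to every linear predictor preserves the ranking of individuals, so in this sketch the C-statistic is the same before and after the adjustment; the shift changes only how closely the average predicted risk matches the observed prevalence.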


Bibliographic record

Authors: Masconi, Katya L; Matsha, Tandi E; Erasmus, Rajiv T; Kengne, Andre P
Year: 2015
Original format: PDF
Language: English

Similar literature

1. Comparison of fasting and 2-hour blood glucose criteria for diagnosing diabetes in different Asian populations [J]. Huang Yi (黄毅), Li Yuanhong (李元红). Chinese-German Journal of Clinical Oncology (English edition). 2001, Issue 004
2. Independent external validation and comparison of prevalent diabetes risk prediction models in a mixed-ancestry population of South Africa [J]. Katya Masconi, Tandi E. Matsha, Rajiv T. Erasmus, Diabetology and Metabolic Syndrome. 2015, Issue S1
3. Validation of two prediction models of undiagnosed chronic kidney disease in mixed-ancestry South Africans [J]. Amelie Mogueo, Justin B Echouffo-Tcheugui, Tandi E Matsha, BMC Nephrology. 2015, Issue 1
4. Effect of missing data on performance of learning algorithms for hydrologic predictions: Implications to an imputation technique [J]. M. Kashif Gill, Tirusew Asefa, Yasir Kaheil, Water Resources Research. 2007, Issue 7
5. Importance of Recalibrating Models for Type 2 Diabetes Onset Prediction: Application of the Diabetes Population Risk Tool on the Health and Retirement Study [C]. Martina Vettoretti, Enrico Longato, Barbara Di Camillo, Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2018
6. Automated Data Imputation: Extending Low Rank Matrix Imputation Techniques For Statistical Prediction Modeling [D]. Page, Milo Tyrus. 2018
7. Effects of Different Missing Data Imputation Techniques on the Performance of Undiagnosed Diabetes Risk Prediction Models in a Mixed-Ancestry Population of South Africa [O]. Katya L. Masconi, Tandi E. Matsha, Rajiv T. Erasmus, -1
8. Effects of Different Missing Data Imputation Techniques on the Performance of Undiagnosed Diabetes Risk Prediction Models in a Mixed-Ancestry Population of South Africa. [O]. Katya L Masconi, Tandi E Matsha, Rajiv T Erasmus, 2015
